Add test time spines for sub-daily granularity #1358
These break on Trino, so I'm removing them for now. I have a task open to add equivalent tests to the SQL rendering section instead, where I can skip them for Trino.
Quick question about the fallback behavior:
What happens when a user adds an HOUR grain time spine but not a DAY? What if they add a MONTH grain time spine but not a DAY?
- ["2020-01-01 00:00:00.000026"]
- ["2020-01-01 00:00:00.000027"]
- ["2020-01-01 00:00:00.000028"]
- ["2020-01-01 00:00:00.000029"]
We might want values from multiple days, even though they won't be contiguous in the input.
I'm not sure if this really matters but I'm always wary of having test data pegged to a boundary (in this case, a year boundary).
Fair enough - I can update that tomorrow! Shouldn't impact any of the tests, just will need to repopulate the source schemas.
- name: ts
type: TIME
rows:
- ["2020-01-01 01:00:00"]
Huh. I just realized, are we going to date_trunc the time spine input to the specified grain? I'm pretty sure we don't do it today, but there's a type for it (DATE). The spec calls for the end user to configure that correctly, so I'm inclined not to date_trunc right now, but it might be something we need to do.
Presumably most people are using packages to build these things so maybe we just rely on that. If we're worried but not very worried about this we could also set up a best-effort warehouse validation, or release a validation package for time spine models that people can use if they wish.
We don't apply that DATE_TRUNC in JoinToTimeSpineNode or JoinOverTimeRangeNode, but we do in ReadSqlSourceNode & MetricTimeDimensionTransformNode. This feature doesn't change that behavior so far, but we could discuss if we want to change it.
I think the warehouse validations will be a good idea so we can have more efficient queries.
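As a rough illustration of what a best-effort validation could check, here is a minimal Python sketch that flags time spine values that are not already truncated to the configured grain. The grain names and the `find_misaligned` helper are hypothetical and are not part of MetricFlow; a real validation would more likely run as a query against the warehouse.

```python
from datetime import datetime
from typing import Iterable, List

# Hypothetical mapping: fields that must be zero for a value to sit on a grain boundary.
_TRUNCATED_FIELDS = {
    "second": {"microsecond": 0},
    "minute": {"second": 0, "microsecond": 0},
    "hour": {"minute": 0, "second": 0, "microsecond": 0},
    "day": {"hour": 0, "minute": 0, "second": 0, "microsecond": 0},
}


def truncate(ts: datetime, grain: str) -> datetime:
    """Return ts truncated to the start of its grain period (like DATE_TRUNC)."""
    return ts.replace(**_TRUNCATED_FIELDS[grain])


def find_misaligned(rows: Iterable[datetime], grain: str) -> List[datetime]:
    """Return the time spine values that are not aligned to the given grain."""
    return [ts for ts in rows if truncate(ts, grain) != ts]


hourly_rows = [datetime(2020, 1, 1, 1, 0), datetime(2020, 1, 1, 1, 30)]
misaligned = find_misaligned(hourly_rows, "hour")
```

If such a check passes, the join nodes can skip DATE_TRUNC on the time spine column without changing results.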
time_spine_sources = {
    legacy_time_spine_grain: TimeSpineSource(
        schema_name=mf_test_configuration.mf_source_schema, table_name=time_spine_base_table_name
    )
}
Wait, does this mean that we're no longer triggering the fallback behavior in the runtime, because we're overriding it with an explicit time spine input?
What happens if I define an HOUR grain time spine but leave the DAY grain one in the original configuration? Does that fail unceremoniously, do we raise an informative error, or do we just use the HOUR spine for everything?
I'm actually fine with raising an informative error or using the HOUR spine, especially for now, I just realized I'm just not clear on what happens.
The legacy time spine config will only be included in your manifest if you have the metricflow_time_spine model in your project. It will be overridden if you add configs pointing to a time spine with DAY. The legacy time spine will be treated like any other time spine you have configured.
If you have a new time spine with HOUR grain in addition to the legacy DAY time spine, we'll use the DAY time spine if you query something with DAY or higher. In that case we would only use the HOUR time spine if you query with HOUR. We'll choose the time spine with the largest grain available that can resolve the requested grain.
type: simple
type_params:
measure: bookings
filter: "{{ TimeDimension('metric_time') }} < '2012-12-20'"
Oh, I think I know why this is. Timestamp literals in Trino have to take the form TIMESTAMP <literal> so we'd need to do some substitution somewhere.
If the error is due to the filter it means we have to do custom rendering against the filter expr.
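To make the substitution idea concrete, here is a deliberately naive Python sketch that prefixes date-like string literals in a filter expression with the TIMESTAMP keyword. The `add_timestamp_keyword` function and its regex are hypothetical and are not how MetricFlow renders filters; plain pattern matching like this would also rewrite literals that are not meant to be timestamps.

```python
import re

# Matches quoted literals shaped like 'YYYY-MM-DD' or 'YYYY-MM-DD HH:MM:SS[.ffffff]'
# that are not already preceded by the TIMESTAMP keyword.
_DATE_LIKE_LITERAL = re.compile(
    r"(?<!TIMESTAMP )'(\d{4}-\d{2}-\d{2}(?: \d{2}:\d{2}:\d{2}(?:\.\d+)?)?)'"
)


def add_timestamp_keyword(filter_expr: str) -> str:
    """Prefix date-like string literals with TIMESTAMP (naive, illustration only)."""
    return _DATE_LIKE_LITERAL.sub(r"TIMESTAMP '\1'", filter_expr)


rendered = add_timestamp_keyword("metric_time < '2012-12-20'")
```

Applying the function twice leaves the expression unchanged, since literals that already carry the keyword are skipped.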
Yep. I couldn't come up with an easy way to fix it here and wasn't sure it was worth doing the hard fix for an engine we barely use.
Agreed. I think the right way to deal with this is via a more holistic approach to filter expression inputs. In the meantime, having a small gap in Trino test coverage - particularly one where the main difference essentially boils down to how we go about pasting in the user-provided expression at render time - seems fine to me.
@tlento If they configure
Ok, that all makes sense. We should follow up on:
- granularity validations/date_trunc behaviors for time spine inputs (I favor validation, I think, as littering the codebase with date_trunc calls is kind of crappy)
- longer term approach to default fallback interactions. For now this feels like a documentation problem but we may need to refine some of our internals a bit.
I think both of these will come up with custom calendar so we'll have an opportunity to re-visit these in the coming months.
@tlento Sounds good! I'll put up tasks for those in the Wrap Up milestone for this project, and we can move them to the custom calendar project if we see fit later.
Makes some updates to enable no-metric queries with sub-daily time dimensions and adds integration tests for those queries.
Add test time spines for sub-daily granularity
I added as much data as seemed reasonable to include manually, and I figure we can focus tests on those time ranges. We can always add more data later.
I included one time spine per sub-daily grain so that we could be sure to test the syntax for each grain against each engine that supports it.
Snapshot changes are separated into isolated commits for easier review. They only include the following changes: